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ABSTRACT 

This paper focuses on checking the correctness and robustness of 
the AT command interface exposed by the cellular baseband proces¬ 
sor through Bluetooth and USB. A device’s application processor 
uses this interface for issuing high-level commands (or, AT com¬ 
mands) to the baseband processor for performing cellular network 
operations (e.g., placing a phone call). Vulnerabilities in this inter¬ 
face can be leveraged by malicious Bluetooth peripherals to launch 
pernicious attacks including DoS and privacy attacks. To identify 
such vulnerabilities, we propose ATFuzzer that uses a grammar- 
guided evolutionary fuzzing approach which mutates production 
rules of the AT command grammar instead of concrete AT com¬ 
mands. Empirical evaluation with ATFuzzer on 10 Android smart¬ 
phones from 6 vendors revealed 4 invalid AT command grammars 
over Bluetooth and 13 over USB with implications ranging from 
DoS, downgrade of cellular protocol version (e.g., from 4G to 3G/2G) 
to severe privacy leaks. The vulnerabilities along with the invalid 
AT command grammars were responsibly disclosed to affected 
vendors. Among the vulnerabilities uncovered, 2 CVEs (CVE-2019- 
16400 and CVE-2019-16401) have already been assigned for the DoS 
and privacy leaks attacks, respectively. 

CCS CONCEPTS 

• Security and privacy —* Mobile and wireless security; Dis¬ 
tributed systems security, Denial-of-service attacks. 
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1 INTRODUCTION 

Modern smartphones operate with two interconnected processing 
units—an application processor for general user applications and a 
baseband processor (also known as a cellular modem) for cellular 
connectivity. The application processor can issue ATtention com¬ 
mands (or, AT commands) [22] through the radio interface layer 
(RIL, also called AT interface) to interact with the baseband pro¬ 
cessor for performing different cellular network operations (e.g., 
placing a phone call). Most of the modern smartphones also accept 
AT commands once connected via Bluetooth or USB. 

Problem and scope. Since the AT interface is an entry point for 
accessing the baseband processor, any incorrect behavior in pro¬ 
cessing AT commands may cause unauthorized access to private in¬ 
formation and inconsistent system states and crashes of the RIL dae¬ 
mon and the telephony stack. This paper thus focuses on the follow¬ 
ing research question: Is it possible to develop a systematic approach 
for analyzing the correctness and robustness of the baseband-related 
AT command execution process to uncover practically-realizable vul¬ 
nerabilities? Incorrect execution of AT commands may manifest in 
one of the following forms: (1) Syntactic errors-, the device accepts 
and processes syntactically invalid AT commands; and (2) Semantic 
violations : the device processes syntactically correct AT commands, 
but does not conform to the prescribed behavior. Successful ex¬ 
ploitations of such invalid commands may enable malicious periph¬ 
eral devices (e.g., a headset), connected to the smartphone over 
Bluetooth, to access phones’s sensitive information, such as, IMSI 
(International Mobile Subscriber Identity, unique to a subscriber) 
and IMEI (International Mobile Equipment Identity, unique to a 
device), or to downgrade the cellular protocol version or stop the 
cellular Internet, even when the peripheral is only allowed to access 
phone’s call and media audio. 

Prior efforts. The previous research [43, 44, 53] strives to iden¬ 
tify the types of valid AT commands (i.e., commands with valid 
inputs/arguments conforming to 3GPP reference [1, 9, 11, 12, 52] 
or vendor-specific commands [13, 14, 18, 21] added for vendor cus¬ 
tomization) exposed through USB interfaces on modern smartphone 
platforms and the functionality they enable. Yet these studies have 
at least one of the following limitations: (A) The analyses [53] do 
not test the robustness of the AT interface in the face of invalid 



commands; (B) The analyses [53] only consider USB interface and 
thus leave the Bluetooth interface to the perils of both valid and 
invalid AT commands; and (C) The analyses [30, 43, 45, 49] are 
not general enough to be applicable to smartphones from different 
vendors. 

Challenges. Conceptually, one can approach our problem using 
one of the following two techniques: (1) static analysis; (2) dynamic 
analysis. As the source code of firmwares is not always available, a 
static analysis-based approach would have to operate on a binary 
level. The firmware binaries, when available, are often obfuscated 
and encrypted. Making such binaries amenable to static analysis 
requires substantial manual reverse engineering effort. To make 
matters worse, such manual efforts are often firmware-version spe¬ 
cific and may not apply to other firmwares, even when they are 
from the same vendor. Using dynamic analysis-based approaches 
also often requires instrumenting the binary to obtain coverage 
information for guiding the search. Like static analysis, such instru¬ 
mentation requires reverse engineering effort which again is not 
scalable. Also, during dynamic analysis, due to the separation of 
the two processors, it is often difficult to programmatically detect 
observable RIL crashes from the application processor. Finally, in 
many cases, undesired AT commands are blacklisted [2-4] and 
hence can cause rate-limiting by completely shutting down the AT 
interface. The only way to recover from such a situation is to re¬ 
boot the test device which can substantially slow down the analysis 
process. 

Our approach. In this paper, we propose ATFuzzer which can test 
the correctness and robustness of the AT interface. One of the key 
objectives driving the design of ATFuzzer is discovering problematic 
input formats instead of just some misbehaving concrete AT com¬ 
mands. Towards this goal, ATFuzzer employs a grammar-guided 
evolutionary fuzzing-based approach. Unlike typical mutation-based 
[28, 35-37, 46, 59] and generation-based [20, 31, 54] fuzzers, ATFuzzer 
follows a different strategy. It mutates the production rules of the AT 
command grammars and uses sampled instances of the generated 
grammar to fuzz the test programs. 

Such an approach has the following two clear benefits. First, a 
production rule (resp., grammar) describing a valid AT command 
can be viewed as a symbolic representation for a set of concrete AT 
commands. Such a symbolic representation enables ATFuzzer to 
efficiently navigate the input search space by generating a diverse 
set of concrete AT command instances for testing. The diversity of 
fuzzed input instances is likely achieved because mutating a gram¬ 
mar can move the fuzzer to test a different syntactic class of inputs 
with high probability. Second, if ATFuzzer can find a problematic 
production rule whose sampled instances can regularly trigger an 
incorrect behavior, the production rule can then be used as a shred 
of possible abstract evidence which can contribute towards the 
identification of the underlying flaw causing the misbehavior. 

ATFuzzer takes grammars of AT commands as the seed. It then 
generates the initial population of grammars by mutating the seed 
grammars. For each generated grammar, ATFuzzer samples grammar- 
compliant random inputs and evaluate the fitness of each gram¬ 
mar based on our proposed fitness function values of the samples. 
Since code-coverage or subtle memory corruptions are not suitable 
to be used as the fitness function for such vendor-specific closed 
firmwares, we leverage the execution timing information of each 


AT command as a loose-indicator of code-coverage information. 
Based on the fitness score of each grammar, ATFuzzer selects the 
parent grammars for crossover operation. We design a grammar- 
aware two-point crossover operation to generate a diverse set of 
valid and invalid grammars. After the crossover with a random 
probability, we incorporate three proposed mutation strategies to 
include randomness within the grammar itself. The intuition be¬ 
hind using both crossover and mutation operations is for testing the 
integrity of each command field as well as the command sequence. 
Findings. To evaluate the generality and effectiveness of our ap¬ 
proach, we evaluated ATFuzzer on 10 Android smartphones (from 6 
different vendors) with both Bluetooth and USB interfaces. ATFuzzer 
has been able to uncover a total of 4 erroneous AT grammars over 
Bluetooth and another 13 AT grammars over USB. Impacts of these 
errors range from complete disruption of cellular network con¬ 
nectivity to retrieval of sensitive personal information. We show 
practical attacks through Bluetooth that can downgrade or shut¬ 
down of Internet connectivity, and also enable illegitimate exposure 
of IMSI and IMEI when such impacts are not possible to achieve 
using valid AT commands. On top of that, the syntactically and se¬ 
mantically flawed AT commands over USB may also cause crashes, 
compound actions, and returns OK/ERROR while still getting pro¬ 
cessed. For instance, an invalid AT command ATDI in LG Nexus 
5 induces the program to execute two valid AT commands— ATD 
(dial) and ATI (display IMEI), simultaneously. These anomalies add 
a new dimension to the attack surface when blacklisting or access 
control mechanisms are put in place to protect the devices from 
valid yet menacing AT commands. The vulnerabilities along with 
the invalid AT command grammars were responsibly disclosed to 
affected vendors. Among the vulnerabilities unearthed, 2 CVEs 
(CVE-2019-16400 [5] and CVE-2019-16401 [6]) have already been 
assigned. 

Contributions. The paper has the following contributions: 


(1) We propose ATFuzzer— an automated and systematic frame¬ 
work that leverages grammar-guided genetic programming 
for dynamic testing of the AT command interface in modern 
Android smartphones. We have made our framework open- 
source alongside the corpus of AT command grammars we 
tested. The tool and its detailed documentation are publicly 
available at https://github.com/Imtiazkarimik23/ATFuzzer 

(2) We show the effectiveness of our approach by uncovering 4 
problematic AT grammars through Bluetooth and 13 prob¬ 
lematic grammars through USB interface on 10 smartphones 
from 6 different vendors. 

(3) We demonstrate that all the anomalous behavior of the AT 
program exposed through Bluetooth are exploitable in prac¬ 
tice by adversaries whereas the anomalous behavior of AT 
programs exposed through USB would be effectively ex¬ 
ploitable even when valid but unsafe AT commands are 
blacklisted. The impact of these vulnerabilities ranges from 
private information exposure to persistent denial-of-service 
attacks. 




(a) USB (b) Bluetooth 


Figure 1: AT Interface for Android Smartphones connected to a host 
machine through USB interface 

2 BACKGROUND 

We now give a brief primer on AT commands. We then discuss how 
we obtain the list of a smartphone-supported AT commands and 
their respective grammars. 

2.1 AT Commands 

Along with the AT commands defined by the cellular standards [11], 
vendors of cellular baseband processors and operating systems sup¬ 
port vendor-specific AT commands [17, 18, 21] for testing and 
debugging purposes. Based on the functionality, different AT com¬ 
mands have different formats; differing in number and types of 
parameters. The following are the four primary uses of AT com¬ 
mands: (i) Get/read a parameter value, e.g., AT + CFUN? returns the 
current parameter setting of +CFUN which controls cellular function¬ 
alities; (ii) Set/write a parameter, e.g., AT + CFUN = 0 turns off (on) 
cellular connectivity (airplane mode); (iii) Execute an action, e.g., 
ATH causes the device to hang up the current phone call; (iv) Test 
for allowed parameters, e.g., AT + CFUN =? returns the allowed pa¬ 
rameters for +CFUN command. Note that, +CFUN is a variable which 
can be instantiated with different functionality (e.g., +CFUN=1 refers 
to setting up the phone with full functionality). 

2.2 AT Interfaces for Smartphones 

AT commands can be invoked by an application running on the 
smartphone, or from another host machine or peripheral device 
connected through the smartphone’s USB or Bluetooth interface 
(shown in Figure 1). While older generations of Android smart¬ 
phones used to allow running AT commands from an installed 
application, recent Android devices have restricted this feature to 
prevent arbitrary applications from accessing device’s sensitive re¬ 
sources illegitimately through AT commands. Contrary to installed 
applications, nearly all Android phones allow executing AT com¬ 
mands over Bluetooth, whereas, for USB, devices require minimal 
configuration to set up to activate this feature. Android smart¬ 
phones typically have different parsers for executing AT commands 
over these interfaces. 

2.3 Issuing AT Commands Over Bluetooth and 
USB 

In this section, we present the details pertaining to issuing AT 
commands over Bluetooth and USB. 

2.3.1 Bluetooth. For executing AT commands over Bluetooth, 
the injecting host machine/peripheral device needs to be paired 


with the Android smartphone. The Bluetooth on a smartphone may 
have multiple profiles (services), but only certain profiles e.g. hands¬ 
free profile (HFP), headset profile (HSP) supports AT commands. 
Figure 1 (right) shows the flow of AT command execution over 
Bluetooth. 

When a device is paired with the host machine, it establishes and 
authorizes a channel for data communication. After receiving an 
AT command, the system-level component of the Bluetooth stack 
recognizes the AT command with the prefix "AT" and compares 
it against a list of permitted commands (based on the connected 
Bluetooth profile). When the parsing is completed, the AT command 
is sent to the application-level component of the Bluetooth stack 
in user space where the Bluetooth API takes the action as per the 
AT command issued. Similar to the example through USB, if a 
baseband related command is invoked e.g.,ATD <phone_no>; , the 
RILD is triggered to deliver the command to the baseband processor. 
Contrary to USB, only a subset of AT commands related to specific 
profiles is accepted/processed through Bluetooth. 

2.3.2 USB. If a smartphone exposes its USB Abstract Control 
Model (ACM) interface [49], it creates a tty device such as /dev/tty- 
ACMO which enables the phone to receive AT commands over the 
USB interface. On the other hand, in phones for which the USB 
modem interface is not included in the default USB configuration, 
switching to alternative USB configuration [49] enables communi¬ 
cation to the modem over USB. The modem interface appears as 
/dev/ttyACM* device node in Linux whereas it appears as a COM* 
port in Windows. Figure 1 (left) shows the execution path of an AT 
command over USB. 

When the AT command injector running on a host machine sends 
a command through /dev/ttyACM* or COM* to a smartphone, the 
ttyABC (ABC is a placeholder for actual name of the tty device) de¬ 
vice in the smartphone receives the AT command and relays it to 
the native daemon in the Android userspace. The native daemon 
takes actions based on the type of command. If the command is 
related to baseband, e.g., ATD <phone_no>;, the RILD (Radio In¬ 
terface Layer Daemon) is triggered to deliver the command to the 
baseband processor which executes the command- makes a phone 
call to the number specified by <phone_no>. On the other hand, if 
the command is operating system-specific (e.g., Android, iOS, or 
Windows), such as +CKPD for tapping a key, the native daemon 
does not invoke RILD. 

2.4 AT Commands and Their Grammars 

We obtain the list of valid AT commands and their grammars from 
the 3GPP standards [1, 8-12, 52], Note that, not every standard AT 
commands are processed/recognized by all smartphones. This is 
because different smartphone vendors enforce different whitelist¬ 
ing and blacklisting policies for minimizing potential security risks. 
Also, vendors often implement several undocumented AT com¬ 
mands. Any problematic input instances that ATFuzzer finds, we 
check to see whether it is one of the vendor-specific, undocumented 
AT commands following the approach by Tian et al. [53], We do 
not report as invalid the undocumented, vendor-specific AT com¬ 
mands that ATFuzzer discovers since they have already been docu¬ 
mented [53]. We aim at finding malformed AT command sets that 
are due to the parsing errors in the AT parser itself. 




































3 OVERVIEW OF OUR APPROACH 

In this section, we first present the threat model and then for¬ 
mally define our problem statement. Finally, we provide a high-level 
overview of our proposed mechanism with a running example. 

3.1 Threat Model 

For Bluetooth and USB AT interfaces exposed by modern smart¬ 
phones, we define the following two different threat models. 

3. 1. 1 Threat model for Bluetooth. For the threat model over Blue¬ 
tooth, we assume a malicious/compromised Bluetooth peripheral 
device (e.g. headphones, speaker) is paired to an Android device over 
Bluetooth. We assume the malicious Bluetooth device is connected 
through its default profile. For instance, the victim smartphone 
which is connected to the malicious headphone has only given 
audio permissions to the headphone. Also, there can be the case 
when the adversary sets up a fake peripheral device through the 
man-in-the-middle (MitM) attacks exploiting known vulnerabilities 
of Bluetooth pairing and bonding [7, 39, 51] procedures. We do not 
assume the presence of malicious apps on the device. 

3.1.2 Threat model for USB. For USB, we assume a malicious USB 
host, such as a PC or USB charging station controlled by an ad¬ 
versary, that tries to attack the connected Android phone via USB. 
We assume the attacker can get access to the exposed AT interface 
even if the device is inactive. We also neither require a malicious 
app to be installed on the device nor the device’s USB debugging to 
be turned on. 

3.2 Problem Statement 

Let I be the set of finite strings over printable ASCII characters, 
7? = {ok, error} denoting the parsing status, and SfK be a set of 
actions (e.g., phone-call € SR). The AT interface of a smartphone 
can be viewed as a function P from I to R X 2^ J?lu { no P- J -(\ that 
is, P : I -> ft X 2<- ?lu { no P- L » in which nop refers to no operation 
whereas _L captures undefined behavior including a crash, nop 
is used to capture the behavior of P ignoring an AT command, 
possibly, due to blacklisting or parsing errors. 

Given the smartphone AT interface under test ’Pjest and a refer¬ 
ence AT interface induced by the standard P^, we aim to identify 
concrete vulnerable AT command instances s 6 I such that 'Pxest 
and P Rgf do not agree on their response for s, that is, 'pRef(s) A 
fVest(s)- Given pairs (n, af), (rz, af) e jjx2^ u ^ nop,i ^, we write 
(ri, a\) = (r 2 , af) if and only if ri = ("2 and a\ = a^. Note that, a\ 
and «2 are both sets of actions as one command can mistakenly 
trigger multiple actions. 

Note that, there can be a reason P^ and PTest can legitimately 
disagree on a specific input AT command s S I as s can be black¬ 
listed by Pjest- Due to CVE-2016-4030 [2], CVE-2016-4031 [3], and 
CVE-2016-4032 [4], Samsung has locked down the exposed AT 
interface over USB with a command whitelist for some phones. 
In this case, we do not consider s to be a vulnerable input in¬ 
stance. Precisely, when s is a blacklisted command, we observed 
that Pjest often returns (ok, nop). Finally, we instantiate the oracle 
'PRef through manual inspection of the standard. 


3.3 Running Example 

To explain ATFuzzer’s approach, we now provide a partial, example 
context-free grammar (CFG) of a small set of AT commands (see 
Figure 3 for the grammar and Figure 2 for the partial Abstract Syntax 
Tree (AST) of the grammar) which we adopted from the original 
3GPP suggested grammar [11, 52]. In our presentation, we use the 
bold-faced font to represent non-terminals and regular-faced font 
to identify terminals. We use to represent explicit concatenation, 
especially, to make the separation of terminals and non-terminals in 
a production rule clear. We use [. . . ] to define regular expressions 
in grammar production rules and [...]* to represent the Kleene 
star operation on a regular expression denoted by [. . . ]. In our 
example, Dnum can take as an argument any alphanumeric string 
up to length n. Our production rules are of the form: s —» a ■ Bi {(/>} 
where a denotes a terminal, Bf represents a non-terminal, and f 
represents a condition that imposes additional well-formedness 
restrictions on the production. 

In the above example, we show the correct AT command format 
for making a phone call. Examples of valid inputs generated from 
this grammar can be— ATD * 752# + 436644101453; 

3.4 Overview of ATFuzzer 

In this section, we first touch on the technical challenges that 
ATFuzzer faces and how we address them. We conclude by pro¬ 
viding the high-level operational view of ATFuzzer. 

3.4.1 Challenges. For effectively navigating the input search space 
and finding vulnerable AT commands, ATFuzzer has to address the 
following four technical challenges. 

Cl: (Problematic input representation). The first challenge is to ef¬ 
ficiently encode the pattern of problematic inputs. It is crucial as 
the problematic AT commands that have similar formats/structures 
but are not identical may trigger the same behavior. For instance, 
both ATD 123 and ATD 1111111111 test inputs are problematic (nei¬ 
ther of them is a compliant AT command due to missing a trailing 
semicolon) and have a similar structure (i.e., ATD followed by a 
phone number), but are not the same concrete test inputs. While 
processing these problematic AT commands, one of our test devices, 
however, stops cellular connectivity. Mutation in the concrete input 
level will require the fuzzer to try a lot of inputs of the same vulner¬ 
able structure before shying away from that abstract input space. 
This may limit the fuzzer from testing diverse classes of inputs. 
C2: (Syntactic correctness). As shown in Figure 3, most of the AT 
commands have a specific number and type of arguments, e.g., 
+CFUN= has two arguments: CFUNargl and CFUNarg2. The 
second challenge is to effectively test this structural behavior and 
type, thoroughly by generating diverse inputs that do not comply 
with the command structure or the argument types. 

C3: (Semantic correctness). Each argument of an AT command may 
have associated conditions. For instance, Length) Dnum) < n in 
the fifth production rule of Figure 3. Also, arguments may correlate 
with each other, such as, one argument defines a type on which 
another argument is dependent. For instance, +CTFR= refers to 
a service that causes an incoming alert call to be forwarded to a 
specified number. It takes four arguments— the first two of them 
are number and type. Interestingly, the second argument defines 
the format of the number given as the first argument. If the dialing 



command 



Figure 2: Partial Abstract Syntax Tree(AST) of the reference grammar (Grey-box denotes non-terminal symbols and white box indicates ter¬ 
minal symbols) 


command —> AT • cmd 

cmd —> dgrammar| cfungrammar 
cmd —» e | cmdAT 
cmd AT —> cmd ; cmd 
dgrammar —» D • Dnum • Darg ; 
cfungrammar —» +CFUN? | +CFUN =CFUNargl, CFUNarg2 
cmd —» +CTFR = number, type, subaddr, satype 
Dnum —» [a — zA — ZO - 9 + *#]* {Len^tZi(Dnum) < n} 
Darg -> I | G | e 

CFUNargl -> [0 - 9]* {CFUNargl € Z and 0 < CFUNargl < 127} 
CFUNarg2 -> [0 - 1] {CFUNarg2 € Z and 0 < CFUNarg2 < 1} 
number] —> number] | number 2 
number 2 —» [a - zA — ZO — 9 + *#]* {if type = 145} 
number —> [a - zA - ZO - 9 * #]* {if type =129} 
type —> 1451129 

subaddr —» [a — zA — ZO - 9 + *#]* 
satype —> [0 — 9]* {if satype = e, satype = 128} 


Figure 3: Partial reference context-free grammar for AT commands, 
string includes access code character then the type should be 
145, otherwise, it should be 129. These correlations are prevalent in 
many AT commands. Hence, the third challenge is to systematically 
test conditions associated with the arguments of commands to 
cover both syntactical and semantic bugs. 

C4: (Feedback of a test input). The AT interface can be viewed as 
a black-box providing only limited output of the form: OK (i.e., 
correctly parsed) or ERROR (i.e., parsing error). The final challenge 
is to devise a mechanism that can provide information about the 
code-coverage of the AT interface for the injected test AT command 
and thus effectively guides us through the fuzzing process. 

3.4.2 Insights on addressing challenges. For addressing Cl, we use 
the grammar itself as the seed of our evolutionary fuzzing frame¬ 
work rather than using a particular instance (i.e., a concrete test 
input) of the grammar. This is highly effective as the mutation of a 
production rule can influence the fuzzer to test a diverse set of in¬ 
puts. Also, when a problematic grammar is identified, it can serve as 
abstract evidence of the underlying flaw in the AT interface. Finally, 
as grammar can be viewed as a symbolic representation of concrete 
input AT commands, mutating a grammar can enable the fuzzer 
to cover large diverse classes of AT commands. The insight here 
is that testing diverse input classes are likely to uncover diverse 
types of issues. 

To address challenges C2 and C3, at each iteration, ATFuzzer 
chooses parents with the highest fitness scores and switches parts 
of the grammar production rules among each other. This causes 


changes to not only the structural and type information in the child 
grammars but also forms two very different grammars that try to 
break the correlation of the arguments. For instance, suppose that 
the ATFuzzer has selected following two production rules from two 
different parent grammars: +CFUN = CFUNargl, CFUNarg2 
and +CTFR = number, type, subaddr, satype. After applying 
our proposed grammar crossover mechanisms, the resultant child 
grammar production rules are: +CFUN = CFUNargl, type, subaddr, 
satype and +CTFR = number CFUNarg2. The production rule 
+CFUN takes only two arguments whereas our new child gram¬ 
mar creates a production rule that has four arguments. The same 
reasoning also applies for +CTFR. Thus the new grammars with 
modified production rules would test this structural behavior pre¬ 
cisely. Furthermore, +CTFR’s first argument number is correlated 
with its second argument type. In the modified child grammars, 
type, however, has been replaced with CFUNarg2. Recall from our 
grammar definition, type takes argument from the set {145,129} 
whereas +CFUNarg2 takes argument from the set {0,1}. There¬ 
fore, this single operation completes two tasks at once— it not only 
tests the correlation among two arguments of +CTFR but also tests 
conditions of both +CFUN and +CTFR. Crossing over grammar 
production rules creates a higher change in the input format and 
it aims to explore the diverse portions of the input space to create 
highly unusual inputs. To test both the structural aspects, we use 
three very different mutation strategies which create little change 
to the grammar (compared to crossover) but prove highly effective 
for checking the conditions associated with commands. 

For addressing C4, we use the precise timing information of 
injecting an AT command and receiving output. We keep an upper 
bound on this time, i.e., a timer ( T ). If the output is not received 
within T, we suspect the AT interface has become unresponsive 
possibly due to the blacklisting mechanism enforced by several 
vendors. We use this timing information as a loose-indicator for 
the code-coverage information. Our intuition is to explore as much 
of the AT interface as possible. Higher timing loosely indicates 
that the test command traverses more basic-blocks than the other 
inputs with lower timing. We try to leverage this simple positive 
correlation to design a feedback edge (i.e., a fitness function) of 
the closed-loop. The timing information, however, cannot help to 
infer how many new basic-blocks a test input could explore. Since 
our focus is mainly on baseband related AT commands, an error 
in the AT interface has a higher probability of causing disruptions 
in the baseband which also trickles down to cellular connectivity. 
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Figure 4: Overview of ATFuzzer framework 


We leverage this key insight and consider both the cellular Internet 
connectivity information from the target device and the device’s 
debug information ( Logcat ) as an indication of the baseband health 
after running an AT command. Using this information, we devise 
our fitness function for guiding ATFuzzer. 

3.4.3 High-level description of ATFuzzer. ATFuzzer comprises of 
two modules, namely, evolution module and evaluation module , in¬ 
teracting in a closed-loop fashion (see Figure 4). The evolution 
module is bootstrapped with a seed AT command grammar which 
is mutated to generate P s j ze (refers to population size, a parame¬ 
ter to ATFuzzer), different versions of that grammar. Concretely, 
new grammars are generated from parent grammar(s) by ATFuzzer 
through the following high-level operations: (1) Population initial¬ 
ization; (2) Parent selection; (3) Grammar crossover; (4) Grammar 
mutation. Particularly relevant is the operation of parent selection 
in which ATFuzzer uses a. fitness function to select higher-ranked 
(parent) grammars for which to apply the crossover/mutation oper¬ 
ations (i.e., steps 3 and 4) to generate new grammars. Choosing the 
higher-ranked grammars to apply mutation is particularly relevant 
for generating effective grammars in the future. 

Evaluating fitness function requires the evaluation module. For 
a given grammar g, evaluation module samples several ^-compliant 
commands to test. It uses the AT command injector (as shown in 
Figures 1 and 4) to send these test commands to the device-under- 
test. The fitness function uses the individual scores of the concrete 
^-compliant instances to assign the score to g. 

4 ATFuzzer: DETAILED DESIGN 

In this section, we discuss our proposed crossover and mutation 
techniques for the evolution module followed by the fitness function 
design used by the evaluation module. 

4.1 Evolution Module 

Given the AT grammar (shown in Figure 3), ATFuzzer’s evolution 
module randomly selects at most n cmds to generate the initial 
seed AT grammar denoted as G^j. The evolution module yields the 
grammars Gb est with the highest scores until a certain stopping 


condition is met, such as, total time of testing or the number of 
iterations. Algorithm 1 describes the high-level steps of ATFuzzer’s 
evolution module. 


4.1.1 Initialization. The evolution module starts with initializing 
the population P (Line 1 in Algorithm 1) by applying both our 
proposed crossover and mutation strategies with three parame¬ 
ters: the population size P s ; ze ; the probability Pp 0p of applying 
crossover and mutation on the grammar; the tournament size T s j ze . 
The key-insight of using 'Ppop is that it correlates with the number 
of syntactic and semantic bugs explored. The higher the value of 
'Ppop is, the diverse the initial population is and vice versa. The di¬ 
verse the initial population is, the higher the number of test inputs 
that check syntactic correctness is and vice versa. Therefore, to ex¬ 
plore both syntactic and semantic bugs, we vary the values of Ppop 
aiming to strike a balance between grammar diversity. To assess 
the fitness, the evolution module invokes the evaluation module 
(Line 3-8) with the generated grammars. 

Algorithm 1: ATFuzzer 


Data: P s j ze , 'Ppop, C>aT. Ui ze 
Result: Gb e st : Best Grammar 

1 P <— InitializePopulation(P s j ze , 'Ppon, Gat); 

2 while stopping condition is not met do 

3 for each grammar G\ G P do 

4 Generate random input I 

5 AssesFitness(Gj, I) 

6 if Fitness(Gj) > Fitness(Gb es t) then 

7 | ^best = Gj; 

8 end 


9 

10 

11 

12 

13 

14 

15 

16 

17 

18 


end 

Q= 0 

for P f c times do 

P a <— ParentSelection(P, T s j ze ) 

Pb <— ParentSelection(P, T s j ze ) 

C a , Cb «— GrammarBasedCrossover(P a , Pb) 
Q = Q U {Mutate(C a ), Mutate(Cb)} 

end 

P<-Q 

end 


4. 1.2 Parent selection for the next round. We use the tournament 
selection technique to get a diverse population at every round. We 
perform “tournaments” among P grammars (Line 12-13 in Algo¬ 
rithm 1). The winner of each tournament (the one with the highest 









































































fitness score) is then selected for crossover and mutation. In what 
follows, we discuss in detail our tournament selection technique 
addressing the functional and structural bloating problems of evo¬ 
lutionary fuzzers [54]. 

Restraining functional bloating. We leverage another insight 
in selecting grammars at each round of the tournament selection 
procedure to reduce functional bloating [54]— the continuous gen¬ 
eration of grammars containing similar mutated production rules— 
which adversely affects diverse input generation in evolutionary 
fuzzing. At each round, we randomly select grammars from our 
population. This is due to the fact that while running an evolu¬ 
tionary fuzzing, the range of fitness values becomes narrow and 
reduces the search space it focuses on. For example, at any round, 
if the fuzzer finds a grammar that has a mutated production rule 
related to +CFUN causing an error state in the AT interface, then 
all the grammars containing this mutated rule will obtain high 
fitness values. If we then only select parents based on the highest 
fitness, we would inevitably fall into functional bloating and would 
narrow down our focused search space with grammars that are 
somehow associated with this mutated version of +CFUN only. 

To constraint this behavior, we perform the tournament selection 
procedure in which we randomly choose T s j ze (where T denotes 
the set of selected grammars for the tournament and T s j ze < P s j ze ) 
number of grammars from the population P. The key insight of 
choosing randomly is to give chances to the lower fitness grammars 
in the next round to ensure a diverse pool of candidates with both 
higher and lower fitness scores. 

Restraining structural bloating. After running ATFuzzer for a 
while, i.e., after a certain number of generations, the average length 
of individual grammar grows rapidly. This behavior is character¬ 
ized as structural bloating. Referring to the AT grammar in Figure 3, 
multiple cmds (production rules) can contribute to generating the 
final commands that are sent to the AT command injector for eval¬ 
uation. These commands can grow indefinitely, but do not induce 
any structural changes, and thus cause structural bloating. These 
input commands, therefore, hardly contribute to the effectiveness 
of the fuzzer. To limit this behavior, we restrict the grammar to 
have at most three cmds at each round to generate the input AT 
commands for testing. 



Figure 5: Examples of one-point and two-point grammar cossover 
mechanisms. 


4.1.3 Grammar crossover. In the grammar crossover stage, ATFuzzer 
strives to induce changes in the grammar aiming to systematically 
break the correlation and structure of the grammar. For this, we 
take inspiration from traditional genetic programming and apply 
our custom two-point crossover technique among the grammars. 
Two-point crossover. ATFuzzer picks up two random production 
rules from the given parent grammars and generates two random 
numbers cj and C 2 within l where t is the minimum length between 
the two production rules. ATFuzzer then swaps the fields of the two 
production rules that are between points Ci and C 2 . 


Figure 5 shows how ATFuzzer performs the two-point crossover 
operation on production rules +CTFN = and +CTFR = (a subset 
of the AT grammar in Figure 3) used for controlling the cellular 
functionalities and for urgent call forwarding, respectively. By ap¬ 
plying two-point crossover on +CFUN = CFUNargl, CFUNarg2 
and +CTFR = number, type, subaddr, satype, ATFuzzer gener¬ 
ates +CFUN = number, CFUNarg2 and +CTFR = CFUNargl, type, 
subaaddr, satype which in turn contribute in generating versatile 
inputs. 

Algorithm 2: Two-Point Grammar Crossover 

Data: ParentGrammar P a , ParentGrammar P^ 

Result: Pa.Pb 

1 Randomly pick production rule R a from Pa and Rfy from Py 

2 Ra < ^**2 . Ra[ 

3 Rb Rb \» Rb2 Rb m 

4 ci <— random integer chosen from 1 to min(l , m) 

5 C 2 <— random integer chosen from 1 to min(l, m) 

6 if c > d then 

7 | swap C\ and C 2 

8 end 

9 for i from c to d — 1 do 

10 | swap grammar rules of R a ^, Rfr. 

11 end 



Adding a production rule 


Figure 6: Example of three grammar mutation strategies. 


4.1.4 Grammar mutation. During crossover operation, ATFuzzer 
constructs grammars that may have diverse structures which are, 
however, not enough to test the constraints and correlations as¬ 
sociated with a command and its arguments. This is due to the 
fact that AT commands have constraints not only with the fields 
but also with the commands itself. Therefore, generating versa¬ 
tile grammars that can generate such test inputs is an important 
aspect of ATFuzzer design. To deal with this pattern, we propose 
three mutation strategies— addition, trimming, and negation. We 
use AT + CTFR = number, type, subaddr, satype (one of the exam¬ 
ple grammars presented in Figure 3) to illustrate these mutation 
strategies with examples shown in Figure 6. 

Addition. With our first strategy we randomly insert/add a field 
chosen from the production rule of the given grammar at a random 
location. For instance, applying this mutation strategy (shown in 
Figure 6) to one of the grammars +CTFR = number, type, subaddr, 
satype yields +CTFR = number, type, satype, subaddr, satype con¬ 
taining an additional argument added after the second argument 
of the actual grammar. The mutation also has changed the type 
(string) of third argument (subaddr) of the actual grammar to inte¬ 
ger (satype) in the new grammar. ATFuzzer, thereby, tests the type 
correctness along with the structure of the grammar. 

Trimming. Our second mutation strategy is to randomly trim an 
argument from a production rule for the given grammar. Referring 
to Figure 6, applying this to our example grammar for +CTFR, 
we obtain a production rule AT + CTFR = number, subaddr, satype 



























which also deviates from the original grammar with respect to both 
the structure and type. 

Negation. Our last mutation strategy focuses on the constraints 
associated with the arguments of a command. Referring to the AT 
grammar in Figure 3, we encode the constraints with additional 
conditions (denoted with {...}) in the grammar production rules. 
With the negation strategy, we randomly pick a production rule of 
the grammar and choose a random argument that has a condition 
associated with it. We negate the condition which we use to replace 
the original one at its original place in the production rule. Figure 6 
demonstrates how we negate the production rule associated with 
number used to represent a phone number. The number is a string 
type with a constraint on its length. We negate this condition with 
the following three heuristics: (i) Generating strings that are longer 
than the specified length; (ii) Generating strings that contain not 
only alphanumeric characters but also special characters; and (iii) 
Generating an empty string. 

Algorithm 3: Grammar Mutation 

Data: Grammar G a , Tunable parameters : P a , Pp, Py 
Result: Mutated G a 

1 Randomly pick production rule R a from G a 

2 Pa * Pa-[ > Ra2> •••» Raj 

3 c <— random integer chosen from 1 to / 

4 P <— Generate random probability from (0, 1) 

5 if P a > P then 

6 | trim argument R ac from production rule R a 

7 end 

8 if Pp > P then 

9 | replace P c {0} with P c {-^}in production rule R a 

to end 

11 if Py > P then 

12 I d <— random integer chosen from 1 to l 

13 add argument P ac at position l in production rule R a 

14 end 


4.2 Evaluation Module 

The primary task of the evaluation module is to generate a number 
of test inputs (i.e., concrete AT command instances) for the grammar 
received from the evolution module. It then evaluates the test inputs 
with the AT command injector, and finally evaluate the grammar 
itself based on the scores of the generated test inputs. To what 
follows we explain how the evaluation module calculates the fitness 
score of a grammar. 


between when the AT command is sent and when the output is 
received by the AT command injector. Note that, we normalize the 
execution time with input length. 

Let ti, t 2 , ..., tjy l> e the time for executing N AT commands, we 
define the timing score for instance x in a population of size N as 


follows: 


timing scor e 


ti 

ti + t2 + + t fsj 


Note that while running AT commands over Bluetooth, the com¬ 
mands and their responses are transmitted in over-the-air (OTA) 
Bluetooth packets. To compute the precise execution time of the 
AT command on the smartphone, we take off the transmission and 
reception times from the total running time. Also, to make sure 
Bluetooth signal strength change does not interfere with the timing 
information, our system keeps track of the RSSI (Received Signal 
Strength Indication) value and carries out the fuzzing at a constant 
RSSI value. 

We define disruption score based on the following four types of 
disruption events: (i) Complete shutdown of SIM card connectiv¬ 
ity; (ii) Complete shutdown of cellular Internet connectivity; (iii) 
Partial disruption in cellular Internet connectivity; (iv) Partial dis¬ 
ruption of SIM card connectivity with the phone. For cases (i) and 
(ii), complete shutdown causes denial of cellular/SIM functionality, 
recovery from which requires rebooting the device. ATFuzzer uses 
adb reboot command which takes ~ 15 - 20 seconds to restart the 
device without entailing any manual intervention. On the contrary, 
partial shutdown for the cases (iii) and (iv) induce denial of cellu¬ 
lar/SIM functionality for ~ 3 — 5 seconds and thus does not call for 
rebooting the device to recuperate back to its normal state. These 
events are detected and monitored using the open-source tools 
available to us from Android, e.g., logcat, dumpsys, and tombstone. 
While injecting the AT commands we use these tools to detect 
the events on run time. We take into account if there is a crash in 
the baseband or the RIL daemon. We assign a score between 0-1 
to a disruption event in which 0 denotes no disruption at all (i.e., 
the device is completely functional) with no adverse effects and 1 
denotes complete disruption of cellular or SIM card connectivities. 
Fitness score of a grammar. After computing the fitness scores 
for all the concrete input instances, we calculate the grammar’s 
score by taking the average of all instance scores. 


4.2.1 Fitness evaluation. At the core of ATFuzzer is the fitness 
function that guides the fuzzing and acts as a liaison for the coverage 
information. We devise our fitness function based on the timing 
information and baseband related information of the smartphone. 
Our fitness function comprises of two parts: (1) Fitness score of 
the test inputs generated from a grammar; (2) Fitness score of the 
grammar in the population. 

Fitness score of the test inputs of a grammar. The fitness eval¬ 
uator of ATFuzzer generates N inputs from each grammar and 
calculates the score for each input. We define this fitness function 
for an input AT command instance x as: 

fitness(x) = a x timing score + (1 — a) x disruption SC0re 
where a is a tunable-parameter that controls the impact of 
timing score an d disruption score . Let t x be the time required for 
executing an AT command x (0 < x < N) on the smartphone 
under test. Execution time of an AT command is defined as the time 


5 EVALUATION 

Our primary goal in this section is to evaluate the effectiveness 
of ATFuzzer by following the best possible practices [26, 55] and 
guidelines [34], We, therefore, first discuss the experiment setup 
and evaluation criteria, and then evaluate the efficacy of our proto¬ 
type against the widely used AFL [59] fuzzer— customized for our 
context. 

5.1 Experiment Setup 

ATFuzzer setup. We implemented ATFuzzer with ~4000 lines of 
Python code. We encoded the grammars (with JSON) for a corpus 
of 90 baseband-related AT commands following the specification in 
the 3GPP [11] documentation and extracting some of the vendor- 
specific AT commands following the work of Tian et al. [53]. During 
its initialization, ATFuzzer receives as input the name of the AT 
command, retrieves the corresponding grammar that will be used 




as the seed (Gat i n algorithm 1) from the file, generates the initial 
grammar population and realizes the proposed crossover and muta¬ 
tion strategies. Hence, our approach is general and easily adaptable 
to other structured inputs, since it is not bound to any specific 
grammar structure. Since testing a concrete AT command instance 
requires 15-20 seconds on average (because of checking the cellu¬ 
lar and SIM card connectivity after executing a command and for 
rebooting the device in case of AT interface’s unresponsiveness for 
blacklisting), we set P s j ze to 10 which we found through empiri¬ 
cal study to be the most suitable in terms of ATFuzzer’s stopping 
condition. Following the same procedure, we test 10 concrete AT 
commands in each round for a given grammar. We set the proba¬ 
bility 'Ppop to 0.5 to ensure uniform distribution in the grammar 
varying ratio. 

Conceptually, one can argue about testing at a “batch” mode 
to chop the average time for fuzzing a AT grammar. For instance, 
injecting 10 AT commands together and then checking the cellular 
and SIM connectivity at once. Though This design philosophy is 
intuitive, but fails to serve our purpose due to the following fact. 
Though it may be able to detect permanent disruptions, it is unable 
to detect temporary disruptions to cellular or SIM connectivity. For 
instance, even if the second AT command in the batch induces a 
temporary disruption, there will be no trace of disruptions at all 
by the time when the tenth (i.e., the last) AT command will be 
executed. 

Target devices. We tested 10 different devices (listed in Table 1) 
from 6 different vendors and with 6 different android versions 
to diversify our analysis. For Bluetooth, we do not require any 
configuration on the phone. For running AT commands over USB 
some phones require additional configuration. For additional details, 


see Appendix A.l. 


Device 

1 Android 
| Version 

1 Build 1 

| Number | 

Baseband 1 
Vendor | 

| Baseband 

USBConfig 

°S 

| Interface 

Samsung 

Note2 

4.3 

JSS15J. 

I9300XU 

GND5 

Samsung 

Exynos 

4412 

N7100DD 

UFND1 

None 

Linux 

Bluetooth 

USB 

Samsung 

Galaxy 

S3 

4.3 

I9300XX 

UGND5 

Samsung 

Exynos 

4412 

19300XX 

UGNA8 



Bluetooth 

and 

USB 

LGG3 

6.0 

MRA58K 

Qualcomm 

dragon 

801 

MPSS.D1.2.0.1. 
cl.13-00114 
-M8974AA 
AAANPZM- 
1.43646.2 



Bluetooth 

and 

USB 

HIG 

Desire 10 
lifestyle 

6.0.1 

1.00.600.1 

8.0_g 

CL800193 

release- 

dragon 

400 

3.0.U205591 
@60906G_01. 
00.U0000. 00_F 

sys.usb.contig 
mtp,adb,diag, 
modem, mo- 

demmdm, 
diag mdm 

Window: 

Bluetooth 

and 

USB 

LG 

Nexus 5 

5.1.1 

LMY48I 

Qualcomm 

dragon 

800 

M8974A- 

2.0.50.2.26 

sys.usb.contig 

diag.adb 


Bluetooth 

and 

USB 

Motorola 
Nexus 6 

6.0.1 

MQB30M 

Qualcomm 

dragon 

805 

MDM9625_ 

104662.22. 

05.34R 

tastboot oem bp- 

Window: 

Bluetooth 

and 

USB 

Huawei 
Nexus 6P 

6.0 

MDA89D 

Qualcomm 

dragon 

810 

.2.6.1.C4- 

00004-M899 

4FAAAAN 

AZM-1 

fastboot oem 
enable-bp-tools 

Window: 

Bluetooth 

and 

USB 

Samsung 

Galaxy 

S8+ 

8.0.0 

R16NW.G95 

5USQU5CRG! 

Qualcomm 

dragon 

835 

G955US 

QU5CRG3 



Bluetooth 

and 

USB 

Huawei 

P8 Lite 
ALE-L21 

5.0.1 

ALE- 

L21CO2B140 

HiSilicon 
Kirin 620 
(28 nm) 

22.126.12.00.00 

N °” 

None 

Bluetooth 

Pixel 2 

8.0.0 

OPD3.1708 

16.012 

Qualcomm 

MSM8998 

dragon 

835 

g8998-00122- 

1708231715 



Bluetooth 


Table 1: List of the devices we tested, with software information, 


USB configuration required and the operating system we used to 
fuzz each device. 


5.2 Evaluation Criteria 

ATFuzzer has three major components— grammar crossover, mu¬ 
tation and feedback loop— to effectively test a target device. We, 
therefore, aim to answer the following research questions to evalu¬ 
ate ATFuzzer: 

. RQ1: How is the bug-finding capability of ATFuzzer over 
Bluetooth? 

. RQ2: How is the bug-finding capability of ATFuzzer over 
USB? 

. RQ3: How effective is our grammar-aware crossover? 

. RQ4: How effective is our grammar-aware mutation? 

. RQ5: When using grammars, how much does the use of 
timing feedback increase fuzzing performance? 

. RQ6: Is ATFuzzer more efficient than other state-of-the-art 
fuzzers for testing AT interface? 

To tackle RQ1-RQ2, we let our ATFuzzer run over USB and Blue¬ 
tooth each for one month to test 10 different smartphones listed 
in Table 1. ATFuzzer has been able to uncover a total of 4 erro¬ 
neous AT grammars inducing a crash, downgrade and information 
leakage over Bluetooth and 13 erroneous AT grammars over USB. 
Based on the type of actions and responses to the problematic AT 
command instances, we initially categorize our results as syntactic 
and semantic problematic AT grammars, and further categorize the 
syntactically problematic grammars into three separate classes— (i) 
responds ok with composite actions; (ii) responds ok with an action; 
and (iii) responds error with an action. Here, an action can be either 
crash (i.e., any disruption event defined in Section 4), leakage of 
any sensitive information, or executing a command, e.g., hanging 
up a phone call. 

We summarize ATFuzzer’s findings for Bluetooth in Table 2 and 
for USB in Table 3. To answer the research questions RQ3-RQ5, 
we evaluate ATFuzzer by disabling one of its components at a time. 
We create three new instances of ATFuzzer— ATFuzzer without 
crossover, ATFuzzer without mutation, and ATFuzzer without fit¬ 
ness evaluation. To what follows we evaluate these three variants 
with the AT grammar (in Figure 3) and compare their efficacy of dis¬ 
covering bugs against original ATFuzzer. Moreover, to answer the 
research question RQ5, we create our variation of AFL (American 
Fuzzy Lop). To perform a fair comparison, we run all our experi¬ 
ments on Nexus5 for each variations of ATFuzzer and our version 
of AFL each for 3 days. 

5.3 Findings Over Bluetooth (RQl) 

Unlike USB, Bluetooth does not require any pre-processing or con¬ 
figuration to the phone to execute AT commands. Besides this, 
over-the-air Bluetooth communications are inherently vulnerable 
to MitM attacks [7, 39, 51]. All these enable the adversary to read¬ 
ily exploit the vulnerabilities over Bluetooth with sophisticated 
attacks. 

5.3. / Resuits. We first discuss the results that relate to invalid AT 
commands and then we discuss the attacks and impacts of both 
invalid and valid AT commands. 

(1) Syntactic errors - responds ok with actions. ATFuzzer un¬ 
covered four problematic grammars in these categories in seven 
different Android smartphones. We observer that the target device 



Class of 

Bugs 

Grammar and Command Instance 

action/implication 

Nexus5 

LG G3 

Nexus6 

Nexus6P 

HTC 

S8plus 

S3 

Note2 

Huawei P81ite 

Pixel 2 

Syntatctic - 
returns OK 
with action 

cmd —»D.Dnum.Dargl.Darg2 

Dnum —*[A — Z0 — 9 + #]* 

Dargl -*I\G\e 

Darg2 —Darg3 

Darg3 ->[A, B, C] + 

E.g. ATD + 46420480341; AB; C 

crash/internet connectiv¬ 
ity disruption 

y 









y 


cmd —»D.Dnum.Dargl.Darg2 

Dnum —>[A - Z0 — 9 + #]* 

Dargl —>I\G\e 

Darg2 —Darg3 

Darg3 ->[A, B, C] + 

E.g. ATD + 46420480341; AB; C 

crash/downgrade 



y 

y 








cmd —»+CIMI.Argl 

Argl —>[a — zA — Z 0 — 9 + #]* 

E.g. AT + CIMI;;;; abc 

read/IMSI leak 






y 

y 

y 




cmd —»+CGSN.Argl 

Argl —>[a — zA — Z0 — 9 + #]* 

E.g. AT + CGSN;;;; abc## 

read/lMEI leak 






V 

y 

y 



Correctly 

cmd —»+CIND? 

read/leaks call status, 

y 

y 

y 

y 


y 

y 

y 

y 

y 

formatted 

command 


call setup stage, internet 
service status, signal 
strength, current roam¬ 
ing status, battery level, 
call held status 












cmd —»+CHUP 

execution (cutting phone 
calls)/ DoS 

y 

y 

y 

y 

y 

y 

y 

y 

y 

y 


cmd —»Arg.D.Dnum.Darg; 

execution/ call forward¬ 

y 

✓ 

y 

/ 

y 

V 

y 

y 

y 

y 


Arg —»[a - zA - Z] 

Dnum —>[<2 — zA - Z0 - 9 + *#]* 

Darg —>I\G\e 

E.g. ATD * *61 * +1812555673 * 11 * 25#; 

ing, activating do not dis¬ 
turb mode, activating se¬ 
lective call blocking 












Table 2: Summary of ATFuzzer’s Bluetooth parser findings. 


responds to the invalid AT command and also performs an action. 
For instance, ATFuzzer found a specific variant of ATD grammar 
ATDA; A; B in Nexus5 which is syntactically incorrect, but returns 
OK and make the cellular Internet connectivity temporarily un¬ 
available. Beside this, the concrete instances of the same grammar 
also downgrade the cellular connectivity from 4G to the 3G/2G in 
Nexus6 and Nexus6P smartphones thus entails severe security and 
privacy impacts. 

5.3.2 Attacks with invalid AT commands. We now present three 
practical attacks that can be carried out using the invalid grammars 
uncovered through ATFuzzer. 

Denial of service. The adversary using a malicious Bluetooth 
peripheral device (e.g., Bluetooth headphone with only call audio 
and media permissions) or a MitM instance may exploit the invalid 
AT command, e.g., ATDB; A; B and temporarily disrupt the Internet 
connectivity of the Pixel 2 and Nexus 5 phones. To cause long term 
disruptions in Internet connectivity, the adversary may inject this 
command intermittently and thus prevent the user from accessing 
the Internet. Note that there is no other valid AT command that 
controls the Internet connectivity over Bluetooth and thus it is not 
possible to achieve this impact using a valid AT command. 
Downgrade. The same invalid grammar (shown in table 2) ex¬ 
ploited in the previous DoS attack in Nexus 5 phone can also be 
exploited to downgrade the cellular connectivity on Nexus6 and 
Nexus6P phones. Similar to previous DoS attack, such downgrade 
of cellular connectivity is not possible with any valid AT commands 
running over Bluetooth. Downgrade (also known as bidding-down) 
attacks have catastrophic implications as they open the avenue 
to perform over-the-air man-in-the-middle attacks in cellular net¬ 
works [25, 40]. 

IMSI & IMEI catching. ATFuzzer uncovered the invalid varia¬ 
tions (AT + CIMI;;;;; abc and AT + CGSN123df) of two valid AT 
commands (+CIMI and +CGSN) which enable the adversary to 
illegitimately capture the IMSI and IMEI of a victim device over 


Bluetooth. Exploiting this, any Bluetooth peripheral connected to 
the smartphone can surreptitiously steal such important personal 
information of the device and the network. We have successfully 
validated this attack in Samsung Galaxy S3, Samsung Note 2 and 
Samsung Galaxy S8+. One thing to be noted here, after manual 
testing we found out the valid versions of these two commands 
also leak IMSI and IMEI. We argue that even if there is a black¬ 
list/firewall policy put into place to stop the leakage through valid 
AT commands, yet it will not be sufficient because it will leave the 
scope to use the invalid versions of the command (that ATFuzzer 
uncovered) to expose this sensitive information. 

The impact of this attack is particularly more fatal than that 
of the previous two attacks. This is because the illegitimate ex¬ 
posure of IMSI and IMEI through Bluetooth provides an edge to 
the adversary to further track the location of the user or intercept 
phone calls and SMS using fake base stations [32, 33] or MitM re¬ 
lays [50], Samsung has already acknowledged the vulnerabilities 
and is working on issuing patches to the affected devices. We also 
summarize the findings of ATFuzzer in Table 2. CVE-2019-16401 [6] 
has been assigned to this vulnerability along with other sensitive 
information leakage for the affected Samsung devices. 

5.3.3 Attacks with valid AT commands. We summarize ATFuzzer’s 
other findings in which we demonstrate that the exposed AT inter¬ 
face over Bluetooth allows the adversary to run valid AT commands 
to attain malicious goals that may negatively affect a device’s ex¬ 
pected operations. The results are particularly interesting as Blue¬ 
tooth interface has yet not been systematically examined. 
Information leak. The adversary can use a valid AT command to 
learn the whole set of private information about the phone. The 
malicious Bluetooth peripheral device can get the call status, call 
setup state, Internet service status, the signal strength of the cellular 
network, current roaming status, battery level, and call hold status 
for the phone using this valid AT commands. 




DoS attacks. A malicious peripheral can exploit the AT + CHUP 
command to prevent the victim device to receive any incoming 
phone call. From the previous information leakage (e.g., call status) 
attack, an attacker can probe periodically to detect whether there 
is a phone call or not. Whenever he detects there is a phone call, 
the attacker injects AT + CHUP to cut the phone call. To make the 
matters worse, the attack is transparent to the victim, i.e., there 
is no indication on the mobile screen that an attack is going on. 
The victim device user perceives either there is no incoming call or 
abrupt call drops due to poor signal quality or network congestions. 
CVE-2019-16400 [5] has been assigned for this along with other 
reported denial of service attacks in Samsung phones. 

Call forwarding. If the victim device is subscribed to call for¬ 
warding service, the adversary may exploit the ATD command to 
forward victim device’s incoming calls to an attacker-controlled 
device. Exploiting this, the adversary first prevents the victim de¬ 
vice from receiving the incoming calls and then learns sensitive 
information, such as password or pin for two-factor authentication 
possibly sent by an automated teller. Note that such call forwarding 
is also transparent to the user since the user is unaware of any 
incoming calls. 

Activating do not disturb mode. The adversary using a malicious 
Bluetooth peripheral or MitM instance can turn on the do not dis¬ 
turb mode of the carrier through ATD command. Similar to call 
forwarding attack, it is also completely transparent to the user as 
no visible indication of do not disturb mode is displayed on the 
device. While the user observes all the network status bars and the 
Internet connectivity, the device, however, does not receive any call 
from the network. 

Selective call blocking. A variation of the previous attack is also 
possible in which the adversary may allow the victim phone to 
receive selective calls by intermittently turning on/off the do not 
disturb mode. This may force the user to receive calls only from 
selective users not affecting others. 

5.4 Findings over USB (RQ2) 

We now discuss findings over USB. 

(1) Syntactic errors - responds ok with composite actions. It is 

one of the interesting classes of problematic grammars for which the 
AT interface of the affected devices respond to invalid AT commands 
with ok, but performs multiple actions together. These invalid com¬ 
mands are compositions of invalid characters and two valid AT com¬ 
mands with no semicolon as their separator. For instance, ATFuzzer 
generated an invalid command ATI HD + 4632048034; using two 
valid grammars for ATD and ATI (as shown in Figure 3) and in¬ 
valid characters for which the target device returns ok but places 
a phone call to 4632048034 and shows the manufacturer, model 
revision, and IMEI information simultaneously. 

(2) Syntactic errors - responds ok with an action. In this type 
of syntactically problematic grammars, the target device responds 
to an invalid command instance with ok but performs an action. 
For instance, the grammar cmd —>Argl. I .Arg2 in Table 3 can 
be instantiated with an invalid command instance ATHIX which 
returns sensitive device information. 

(3) Syntactic errors - responds error with an action. In this 
class of syntactic errors, the AT interface recognizes the inputs as 


faulty by acknowledging with error, but it still executes the action 
associated with the command and even does worse by crashing 
the RIL daemon and inducing complete disruptions in the cellular 
Internet connectivity. It basically reveals a fundamental flaw in the 
AT interface— if a command is considered as erroneous, it should 
not be executed. For example, the grammar cmd —>D . Dnum in 
Table 3 can be instantiated with ATD+4632048034 (a variation of 
the ATD production rule in Figure 3) which is supposed to start a 
cellular voice call. Instead, the grammar returns error in the form of 
NO CARRIER and induces the cellular Internet connectivity to go 
down completely for a certain amount of time (15-20 seconds). We 
have also found grammars for which the device returns other error 
statuses, e.g., ERROR, NO CARRIER, CME ERROR, ABORTED, and 
NOT SUPPORTED, but still executes those invalid commands. 

(4) Semantic errors. This class of grammars conforms with the 
input pattern defined by the standards [11], but induces disrup¬ 
tions in the cellular connectivity for which the recovery requires 
rebooting the device. The grammars of this class are shown in Table 
3. 

Possible exploitation. It may appear that the implications of in¬ 
valid AT commands over USB are negligible as compared to the 
valid AT commands which may wreak havoc by taking full control 
of the device. We, however, argue that if AT interface exposure 
is restricted through blacklisting the critical and unsafe valid AT 
commands by the parser in the first place, the adversary will still 
be able to induce the device to perform same semantic function¬ 
alities using invalid AT commands. This is due to the uncovered 
vulnerabilities for which the parser will fail to identify the invalid 
AT commands as the blacklisted commands and thus allows the 
adversary to achieve same functionalities as the valid ones. 

5.4.1 Efficacy of grammar-aware crossover (RQ3). ATFuzzer with¬ 
out crossover (by disabling the crossover in ATFuzzer) uncovered 
only 3 problematic grammars as compared to ATFuzzer with all 
proposed crossover and mutations (Table 4). This is due to the fact 
that ATFuzzer without crossover cannot induce enough changes in 
the structure and type of the arguments of parent grammars, as a 
result of which it reduces the search space. 

5.4.2 Efficacy of grammar-aware mutation (RQ4). Since ATFuzzer 
without mutation cannot induce changes in the arguments and the 
respective conditions, it uncovered only 2 problematic grammars. 
ATFuzzer without crossover, however, performs slightly better than 
that of the ATFuzzer without crossover. This also justifies our in¬ 
tuition that mutation strategies play a vital role in any fuzzer as 
compared to crossover techniques. Without mutation, a fuzzer un¬ 
likely generates interesting inputs for the system under test. 

5.4.3 Efficacy of timingfeedback (RQ5). We observed that ATFuzzer 
without feedback performs better than the other two (RQ2 and RQ3) 
variations. ATFuzzer without feedback uncovered 5 problematic 
grammars and thus is less effective than ATFuzzer with feedback. 
AT interface being a complete black box with little to no feedback 
we had to resort to various creative ways including timing informa¬ 
tion to generate feedback score. However, This resorts to an upper 
bound for the coverage information and loosely dictates ATFuzzer. 

5.4.4 Comparison with other state-of-the-art fuzzer (RQ6). We com¬ 
pare the effectiveness of ATFuzzer against AFL (American Fuzzy 
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Table 3: Summary of ATFuzzer’s findings over USB. 


Lop) [59]. Since current versions of AFL require instrumenting the 
test programs, we implemented a modest string fuzzer that adopts 
five mutation strategies ( walking bit flips, walking byte flips, known 
integers, block deletion and block swapping ) employed by AFL and 
incorporated our proposed timing-based feedback loop to it. We 
evaluate this AFL variant with 80 different seeds (consisting of valid 


and invalid command instances of randomly chosen 40 different 
AT reference grammars). 

Table 4 shows that the AFL variant uncovered 2 different prob¬ 
lematic grammars whereas ATFuzzer uncovers 9 unique grammars 
after running for 3 days. Though we decided to compare our tool 
with AFL, which is the best choice we had as AFL is considered the 
state-of-the-art tool for fuzzing, we do not claim the comparison to 
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ATFuzzer 

9 

Al Fuzzer w/o feedback 

5 

ATFuzzer w/o crossover 

3 

ATFuzzer w/o mutation 

2 

Modified AFL 

2 


Table 4: Result obtained with different fuzzing approaches on 
Nexus5 over a period of 3days for each approach. 

be ideal. Because AFL relies heavily on code average information 
and for our case, we replaced the coder coverage with the best 
available substitute, i.e., coarse-grained timing information as a 
loose indicator to code coverage. We acknowledge that this is a 
best-effort approach and the evaluation may be sub-deal. 

6 RELATED WORK 

In this section, we mainly discuss the relevant work on the following 
four topics: AT commands, mutation-based fuzzing, grammar-based 
mutation, grammar-based generation. 

AT commands. Most of the previous work related to AT com¬ 
mands follow investigate how an adversary can misuse valid AT 
commands to attack various systems. The work from Tian et al. 
[53] can be considered the most relevant to our work, however, it is 
significantly different in the following three aspects: (i) Firstly, they 
only show the impact of AT commands over USB as they consider 
the functionality and scope of AT commands over Bluetooth too lim¬ 
ited to study. We, however, demonstrate the dire consequences of 
AT commands over Bluetooth interface with the uncovered invalid 
and valid AT commands, (ii) Secondly, they only show the impact of 
valid AT commands whereas we demonstrate the impact of invalid 
AT commands exploring different attack surfaces, (iii) Finally, one 
of the primary objectives of our work is to test the robustness of the 
AT interface, which is a different and complimentary end objective 
than theirs. 

BlueBug [24] exploits a Bluetooth security loophole on few 
Bluetooth-enabled cell phones and issues AT commands via a covert 
channel. It, however, relies on the Bluetooth security loophole to 
attack and does not apply to all phones. In contrast, we have demon¬ 
strated a variety of attacks using valid and invalid AT commands 
running over Bluetooth which do not rely on any specific Bluetooth 
assumptions and also applicable to all the modern smartphones 
we had in our corpus. Injecting AT commands on android base¬ 
band was previously discussed on the XDA forum [23], Pereira 
et al. [43, 45] used AT commands to flash malicious images on 
Samsung phones. Hay [29] discovered that AT interface can be 
exploited from Android bootloader and discovered new commands 
and attacks using the AT interface. AT commands have been used to 
exploit modems other than smartphones as well. Most prominently, 
USBswitcher [44, 49] and [43] demonstrate how these commands 
expose actions potentially causing security vulnerabilities in smart¬ 
phones. Some other work use AT commands as a part of their tool, 
for instance, Mulliner et al. [42] use the AT commands as feedback 
while fuzzing SMS of phones. Xenakis et al. [57, 58] devise a tool 
using AT commands to steal sensitive information from baseband. 
None of them, however, actually analyzes or discovers bugs in the 
AT parser itself. 


Mutation based fuzzers. Initial mutation-based fuzzers [41] used 
to mutate the test inputs randomly. To make this type of fuzzers 
more effective, a huge amount of work has been carried out to de¬ 
velop sophisticated techniques to improve mutation strategies— cov¬ 
erage information through instrumenting the binary [28, 36, 37, 59]; 
resource usage information [35, 46]; control and data flow features 
[48]; static vulnerability prediction models [38]; data-driven seed 
generation [55]; high-level structural representation of seed file 
[47]. There are also a few mutation-based fuzzers that incorporate 
the idea of grammars rather than inputs. Wang et al. [56] use gram¬ 
mars to guide mutation whereas Aschermann et al. [26] rely on 
code coverage feedback. Simulated annealing power schedule with 
genetic fuzzing has also been incorporated in [27]. However, due 
to the black-box nature of our system and structural pattern of AT 
command inputs, none of the existing concepts suffice fuzzing AT 
parser. 

Generation-based fuzzers. Generation based fuzzers generate 
inputs based on a model [19, 20, 31, 54], specification or defined 
grammars. However, to the best of our knowledge, no fuzzer discov¬ 
ers a class of bugs at the grammar-level, rather generates concrete 
input instances. There are also some generation-based, more pre¬ 
cisely, defined grammar-based fuzzers [16] [15] which use manually 
specified grammars as inputs. For instance, Mangeleme is an auto¬ 
mated broken HTML generator and fuzzer, and Jsfunfuzz [15] uses 
specific knowledge about past and present vulnerabilities and uses 
grammar rules to produce inputs that may cause problems. Both of 
them are, however, random fuzzers. 

7 DISCUSSION 

Defenses. Our findings show that current implementations of base¬ 
band processors and AT command interfaces fail to correctly parse 
and filter out some of the possible anomalous inputs. In this paper, 
we do not explicitly explore defenses for preventing malicious users 
from exploiting these flaws. However, we show that restricting the 
AT interface through access control policies, black-listing should 
not work due to the parsing bugs and invalid AT commands, that 
the parser executes. Completely removing the exposure of AT mo¬ 
dem interface over Bluetooth and USB can resolve the problem. 
Other than that at a conceptual level, having a formal grammar 
specification of the supported AT command grammar may provide 
a better way to test implementations of the AT interface. Another 
aspect that requires particular attention is the deployment of stricter 
policies that filter out anomalous AT commands. 

Responsible disclosure. Given the sensitive nature of our find¬ 
ings, we have reported these to the relevant stakeholders (e.g., 
respective modems and devices vendors and manufacturers). More¬ 
over, following the responsible disclosure policy we have waited 
90 days before making our findings public. Currently to our knowl¬ 
edge Samsung has been working to issue a patch to mitigate the 
vulnerabilities. 

8 CONCLUSION AND FUTURE WORK 

The paper proposes ATFuzzer for testing the correctness of the 
AT interface exposed by the baseband processor in a smartphone. 
Towards this goal, ATFuzzer leverages a grammar-guided evolu¬ 
tionary fuzzing-based approach. Unlike generational fuzzers which 



use the input grammar to generate syntactically correct inputs, 
ATFuzzer mutates the production rules in the grammar itself. Such 
an approach enables ATFuzzer to not only efficiently navigate the 
input search space but also allows it to exercise a diverse set of 
input AT commands. 

Future work. In the future, we want to apply hybrid fuzzing in 
our problem domain. In the hybrid fuzzing paradigm, a black-box 
fuzzer’s capabilities are enhanced through the use of lightweight 
static analysis (e.g., dynamic symbolic execution, taint analysis). 
Such an approach would, however, require us to address the issues 
concerning firmware binaries’ practice of employing obfuscation 
and encryption. 

ACKNOWLEDGEMENT 

We thank the anonymous reviewers for their suggestions. This 
work is supported by NSF grants CNS-1657124 and CNS-1719369, 
Intel, and a grant by Purdue Research Foundation. 

REFERENCES 

[1] [n.d.]. AT Commands For CDMA Wireless Modems. http://www.canarysystems. 
com/nsupport/CDMA_AT_Commands.pdf. 

[2] [n.d.]. CVE-2016-4030. 
https://nvd.nist.gov/vuln/detail/CVE-2016-4030. 

[3] [n.d.]. CVE-2016-4031. 
https://nvd.nist.gov/vuln/detail/CVE-2016-4031. 

[4] [n.d.]. CVE-2016-4032. 
https://nvd.nist.gov/vuln/detail/CVE-2016-4032. 

[5] [n.d.]. CVE-2019-16400. 

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE- 2019-16400. 

[6] [n.d.]. CVE-2019-16401. 

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE- 2019-16401. 

[7] [n.d.]. CWE-325: Missing Required Cryptographic Step - CVE-2018-5383. In 
Cernegie Mellon University,CERT Coordination Center, https://www.kb.cert.org/ 
vuls/id/304725/. 

[8] [n.d.]. Digital cellular telecommunications system (Phase 2+); AT Command 

set for GSM Mobile Equipment (ME) (3GPP TS 07.07 version 7.8.0 Release 
1998). https://www.etsi.org/deliver/etsi_ts/100900_100999/100916/07.08.00_ 

60/ts_100916v070800p.pdf. 

[9] [n.d.]. Digital cellular telecommunications system (Phase 2+) (GSM); Universal 
Mobile Telecommunications System (UMTS); LTE; AT command set for User 
Equipment (UE) (3GPP TS 27.007 version 13.6.0 Release 13). https://www.etsi. 
org/deliver/etsi_ts/127000_127099/127007/13.06.00_60/ts_127007vl30600p.pdf. 

[10] [n.d.]. Digital cellular telecommunications system (Phase 2+); Specification of 
the Subscriber Identity Module -Mobile Equipment (SIM-ME) interface (3GPP TS 
51.011 version 4.15.0 Release 4). https://www.etsi.org/deliver/etsi_TS/151000_ 
151099/151011/04.15.00_60/ts_15101 lv041500p.pdf. 

[11] [n.d.]. Digital cellular telecommunications system (Phase 2+), Universal Mobile 
Telecommunications System UMTS, LTE, AT command set for User Equipment 
UE. https://www.etsi.org/deliver/etsi_ts/127000_127099/127007/10.03.00_60/ts_ 
127007vl00300p.pdf. 

[12] [n.d.]. Digital cellular telecommunications system (Phase 2+); Use of Data Ter¬ 
minal Equipment - Data Circuit terminating; Equipment (DTE - DCE) interface 
for Short Message Service (SMS) and Cell Broadcast Service (CBS) (GSM 07.05 
version 5.3.0). https://www.etsi.org/deliver/etsi_gts/07/0705/05.03.00_60/gsmts_ 
0705v050300p.pdf. 

[13] [n.d.]. EVDO and CDMA AT Commands Reference Guide. https://www. 
multitech.com/documents/publications/manuals/s000546.pdf. 

[14] [n.d.]. HUAWEIMU609 HSPA LGA Module Application Guide. 

http s: // www.paoli.cz/out/me dia/HUAWEI_MU609_HSPA_LG A_Module_ 
Application_Guide_V100R002_02(l).pdf. 

[15] [n.d.]. jsfunfuzz [online]. 

https://github.com/MozillaSecurity/funfuzz/tree/master/sre/funfuzz/j s/ 
jsfunfuzz. 

[16] [n.d.]. Mangleme [Online]. 

https://github.com/WebKit/webkit/tree/master/Tools/mangleme. 

[17] [n.d.]. Motorola AT Command Set. https://ipfs.io/ipfs/ 

QmXoypizjW3WknFiJnKLwHCnL72vedxjQkDDPlmXWo6uco/wiki/Motorola_ 
phone_AT_commands.html. 

[18] [n.d.]. Neo 1973 and Neo FreeRunner GSM modem, AT Command set. http: 
//wiki.openmoko.org/wiki/Neo_1973_and_Neo_FreeRunner_gsm_modem. 


[19] [n.d.]. Peach Fuzzer Platform [online], https://www.peach.tech/. 

[20] [n.d.]. Radamsa [online], https://gitlab.com/akihe/radamsa. 

[21] [n.d.]. Sony Erricsson AT Command set. https://www.activexperts.com/sms- 
component/at/sonyericsson/. 

[22] [n.d.]. Wikipedia. 

https://en.wikipedia.org/wiki/Hayes_command_set. 

[23] [n.d.]. XDA Forum [online]. 

https://forum.xda- developers.com/galaxy- s2/help/how- to- talk- to- modem- 
commands- tl471241. 

[24] M.Herfurt A. Laurie, M. Holtmann. [n.d.]. The bluebug. AL Digital Ltd. https: 
//trifinite.org/trifinite_stuff_bluebug.html#introduction. 

[25] Iosif Androulidakis. 2011. Intercepting mobile phone calls and short messages 
using a gsm tester. In International Conference on Computer Networks. Springer, 
281-288. 

[26] Cornelius Aschermann, Tommaso Frassetto, Thorsten Holz, Patrick Jauernig, 
Ahmad-Reza Sadeghi, and Daniel Teuchert. 2019. NAUTILUS: Fishing for Deep 
Bugs with Grammars. In Proceedings of the Network and Distributed System 
Security Symposium (NDSS). 

[27] Marcel Bohme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 

2017. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference 
on Computer and Communications Security. ACM, 2329-2344. 

[28] S. Gan, C. Zhang, X. Qin, X. Tu, K. Li, Z. Pei, and Z. Chen. [n.d.]. CollAFL: Path 
Sensitive Fuzzing. In 2018 IEEE Symposium on Security and Privacy (SP), Vol. 00. 
660-677. https://doi.Org/10.l 109/SP.2018.00040 

[29] Roee Hay. 2017. fastboot oem vuln: android bootloader vulnerabilities in vendor 
customizations. In 11th { USENIX} Workshop on Offensive Technologies ({WOOT} 
17). 

[30] Roee Hay and Michael Goberman. 2017. Attacking Nexus 6 & 6P Custom 
Bootmodes. (2017). https://www.docdroid.net/dxKUj5c/attacking-nexus-6- 
6p- custom- bootmodes.pdf. 

[31] Christian Holler, Kim Herzig, and Andreas Zeller, [n.d.]. Fuzzing with Code 
Fragments. 

[32] Syed Rafiul Hussain, Omar Chowdhury, Shagufta Mehnaz, and Elisa Bertino. 

2018. LTEInspector: A Systematic Approach for Adversarial Testing of 4G LTE. 
In 25th Annual Network and Distributed System Security Symposium, NDSS, San 
Diego, CA, USA, February 18-21. 

[33] Syed Rafiul Hussain, Mitziu Echeverria, Omar Chowdhury, Ninghui Li, and Elisa 
Bertino. 2019. Privacy Attacks to the 4G and 5G Cellular Paging Protocols Using 
Side Channel Information. (2019). 

[34] George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. 
Evaluating Fuzz Testing. In Proceedings of the 2018 ACM SIGSAC Conference on 
Computer and Communications Security (CCS ’18). ACM, New York, NY, USA, 
2123-2138. https://doi.org/10.1145/3243734.3243804 

[35] Caroline Lemieux, Rohan Padhye, Koushik Sen, and Dawn Song. 2018. PerfFuzz: 
Automatically Generating Pathological Inputs. In Proceedings of the 27th ACM 
SIGSOFTInternational Symposium on Software Testing and Analysis (ISSTA 2018). 
ACM, New York, NY, USA, 254-265. https://doi.org/10.1145/3213846.3213874 

[36] Caroline Lemieux and Koushik Sen. 2018. FairFuzz: A Targeted Mutation Strat¬ 
egy for Increasing Greybox Fuzz Testing Coverage. In Proceedings of the 33rd 
ACM/IEEE International Conference on Automated Software Engineering (ASE 2018). 
ACM, New York, NY, USA, 475-485. https://doi.org/10.1145/3238147.3238176 

[37] Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, 

and Alwen Tiu. 2017. Steelix: Program-state Based Binary Fuzzing. In Proceedings 
of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 
2017). ACM, New York, NY, USA, 627-637. https://doi.org/10.1145/3106237. 

3106295 

[38] Yuwei Li, Shouling Ji, Chenyang Lv, Yuan Chen, Jianhai Chen, Qinchen Gu, and 
Chunming Wu. 2019. V-Fuzz: Vulnerability-Oriented Evolutionary Fuzzing. CoRR 
abs/1901.01142 (2019). arXiv:1901.01142 http://arxiv.org/abs/1901.01142 

[39] Angela Lonzetta, Peter Cope, Joseph Campbell, Bassam Mohd, and Thaier Haya- 
jneh. 2018. Security vulnerabilities in Bluetooth technology as used in IoT. Journal 
of Sensor and Actuator Networks 7, 3 (2018), 28. 

[40] Ulrike Meyer and Susanne Wetzel. 2004. A man-in-the-middle attack on UMTS. 
In Proceedings of the 3rd ACM workshop on Wireless security. ACM, 90-97. 

[41] Barton P. Miller, Louis Fredriksen, and Bryan So. 1990. An Empirical Study 
of the Reliability of UNIX Utilities. Commun. ACM 33, 12 (Dec. 1990), 32-44. 
https://doi.org/10.1145/96267.96279 

[42] Collin Mulliner and Charlie Miller. 2009. Fuzzing the Phone in your Phone (Black 
Hat USA 2009). 

[43] Andre Pereira, Manuel Correia, and Pedro Brandao. 2014. Charge your device 
with the latest malware.. In BlackHat Europe. 

[44] Andre Pereira, Manuel Correia, and Pedro Brandao. 2014. USB Connection 
Vulnerabilities on Android Smartphones: Default and Vendors’ Customizations. 
In Communications and Multimedia Security, Bart De Decker and Andre Zuquete 
(Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 19-32. 

[45] Andre Pereira, Manuel Correia, and Pedro Brandao. 2014. USB connection 
vulnerabilities on android smartphones: Default and vendorsaAZ customizations. 



In IFIP International Conference on Communications and Multimedia Security. 
Springer, 19-32. 

[46] Theofilos Petsios, Jason Zhao, Angelos D. Keromytis, and Suman Jana. 2017. 
SlowFuzz: Automated Domain-Independent Detection of Algorithmic Complexity 
Vulnerabilities. CoRR abs/1708.08437 (2017). arXiv: 1708.08437 http://arxiv.org/ 
abs/1708.08437 

[47] Van-Thuan Pham, Marcel Bohme, Andrew E. Santosa, Alexandru Razvan Caci- 
ulescu, and Abhik Roychoudhury. 2018. Smart Greybox Fuzzing. CoRR 
abs/1811.09447 (2018). arXiv:1811.09447 http://arxiv.org/abs/1811.09447 

[48] Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, 
and Herbert Bos. 2017. Vuzzer: Application-aware evolutionary fuzzing. In 
Proceedings of the Network and Distributed System Security Symposium (NDSS). 

[49] P. Roberto and F. Aristide. 2014. Modem interface exposed via USB.. In BlackHat 
Europe, https://github.com/ud2/advisories/tree/master/android/samsung/nocve- 
2016-0004. 

[50] David Rupprecht, Katharina Kohls, Thorsten Holz, and Christina Popper, [n.d.]. 
Breaking LTE on layer two. 

[51] Mike Ryan. 2013. Bluetooth: With low energy comes low security. In Presented 
as part of the 7th {USENIX} Workshop on Offensive Technologies. 

[52] Wireless Solutions Telit, [n.d.]. AT Commands Reference Guide. 
https://www.telit.com/wp-content/uploads/2017/09/Telit_AT_Commands_ 
Reference_Guide_r24_B.pdf. 

[53] Dave (Jing) Tian, Grant Hernandez, Joseph I. Choi, Vanessa Frost, Christie 
Raules, Patrick Traynor, Hayawardh Vijayakumar, Lee Harrison, Amir Rahmati, 
Michael Grace, and Kevin R. B. Butler. 2018. ATtention Spanned: Comprehen¬ 
sive Vulnerability Analysis of AT Commands Within the Android Ecosystem. In 
27th USENIX Security Symposium (USENIX Security 18). Baltimore, MD, 273-290. 
https://www.usenix.org/conference/usenixsecurityl8/presentation/tian. 

[54] Spandan Veggalam, Sanjay Rawat, Istvan Haller, and Herbert Bos. 2016. IFuzzer: 
An Evolutionary Interpreter Fuzzer Using Genetic Programming. In Computer 
Security - ESORICS 2016, Ioannis Askoxylakis, Sotiris Ioannidis, Sokratis Katsikas, 
and Catherine Meadows (Eds.). Springer International Publishing, Cham, 581- 
601. 

[55] J. Wang, B. Chen, L. Wei, and Y. Liu. 2017. Skyfire: Data-Driven Seed Generation 
for Fuzzing. In 2017 IEEE Symposium on Security and Privacy (SP). 579-594. https: 
//doi.org/10.1109/SP.2017.23 

[56] Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2018. Superion: Grammar- 
Aware Greybox Fuzzing. CoRR abs/1812.01197 (2018). arXiv:1812.01197 http: 
//arxiv.org/abs/1812.01197 

[57] Christos Xenakis and Christoforos Ntantogian. 2015. Attacking the baseband 
modem of mobile phones to breach the users’ privacy and network security. 
In Cyber Conflict: Architectures in Cyberspace (CyCon), 2015 7th International 
Conference on. IEEE, 231-244. 

[58] Christos Xenakis, Christoforos Ntantogian, and Orestis Panos. 2016. (U) SimMon- 
itor: A mobile application for security evaluation of cellular networks. Computers 
& Security 60 (2016), 62-78. 

[59] M. Zalewski. [n.d.]. American fuzzy lop. [online], http://lcamtuf.coredump.cx/afl/. 


A APPENDIX 

A.l Target Devices Configuration. 

In this section, we provide additional detailed information about 
the required set up for the devices we tested. 

Some of the devices we tested expose their modem functional¬ 
ity by default and therefore required no additional configuration 
(also listed in Table 1). On the other hand, for the devices that do 
not expose any modem, it was necessary to root them and set a 
specific type of USB configuration. The USB configuration can be 
changed by setting sys.usb.config property. All the devices can be 
accessed through ADB (Android Debug Bridge) and Fastboot tools. 
With ADB it is possible to access the device’s file system, reboot 
it in different modes, such as bootloader mode, rooting it, and 
finally change the device’s properties directly with the command 
setprop <property-name> <value>. With fastboot, it is possible 
to operate the device in bootloader mode, install new partitions 
and change pre-boot settings required for rooting. For LG Nexus 5, 
we had to set sys.usb.config from the default “mnt.adb” to “diag.adb” 
through adb shell. This setting allows to access the phone in 


diagnostic mode and therefore to communicate with the AT com¬ 
mand interface. For Motorola Nexus 6 and Huawei Nexus 6P, the 
USB configuration can be changed by first rebooting the phone in 
bootloader mode and then issuing the command “fastboot oem 
bp-tools-on” and “fastboot oem enable-bp-tools” to Nexus 6 
and Nexus 6P, respectively as reported in [30], After establishing 
serial communication with the device, it is possible to communicate 
with the smartphone through the AT interface. 



